Logo Logo Cranfield
Analysing microbial communities from RestREco project

Abstract

As ecosystems are facing challenges from anthropogenic sources including land use changes and habitat destruction, methods for their re-establishment are being investigated.

Assisting the recovery of and building resilience in ecosystems requires the study of how ecological communities respond to different environmental conditions including drought. According to the UK Centre for Ecology and Hydrology, grasslands make up 40% of the UK’s land surface while grassland restoration has been identified as a means of increasing biodiversity and reducing carbon emissions. Soils have been identified as one of the most biodiverse habitats, and are home to between 44% and 74% of all species on Earth. Soil microbial communities, including bacteria and fungi, provide ecosystem services such as nutrient cycling, decomposition and regulating carbon storage, which are affected by changes in soil moisture. According to the Environment Agency, soil moisture deficits across the UK have increased in March and April 2025. Additionally more frequent droughts are predicted due to climate change, therefore a greater understanding of how soil microbial communities respond to changes in soil moisture is key to building resilience in grassland ecosystems. The aim of this study is to investigate how microbial species diversity, community composition and relationships between bacterial and fungal communities in grassland soils respond to changes in soil moisture levels. DNA sequences from bacterial 16S rRNA and fungal ITS from soil samples collected by the Restoring Resilient Ecosystems (RestREco) ecological restoration project will be analysed using bioinformatics tools. The methods will include running computational analysis tools using the Crescent2 high performance computing facilities and/or RStudio, testing hypotheses using statistical tests and placing findings in the context of current literature in scientific journals.

About the RestREco project

RestREco (Restoring Resilient Ecosystems) is a NERC-funded research project that aims to identify ecological restoration practices that can build resilience in ecosystems, particularly woodlands and grasslands. The initiative brings together researchers from:

  • Cranfield University
  • University of Stirling
  • UK Centre for Ecology & Hydrology
  • The National Trust
  • Forest Research

RestREco studies a network of 133 ecological restoration sites across England and Scotland. The project splits its work into seven work packages, each focusing on different factors and relating them to ecosystem characteristics that arise from component interactions. It aims to identify key drivers of ecosystem development, including:

  • Time since restoration began
  • Initial ecological conditions
  • Proximity to existing woodland and grassland

The goal is to understand how these factors influence ecosystem complexity, function, and resilience to future pressures (RestREco, 2024).


About this study

This study uses targeted amplicon sequences (16S, ITS), retrieved from soil samples, to determine how drought affects soil microbial communities, specifically bacteria and fungi, in UK grasslands.

This study specifically focuses on:

  • Alpha and beta diversity
  • Taxonomic composition
  • Functional diversity


Sample design and metadata

A total of 72 soil samples were collected across six sites in England for each marker (6 per site: 3 control, 3 shelter).

Some fungal samples failed the initial quality control during the sequencing process, resulting in the removal of one site from the analysis (for fungi only), while other sites where one or two samples failed quality, were kept in (also for fungi).

Sample Zone - Based on GPS coordinates

Sample overview
Metric 16S ITS
Microbial group Bacteria Fungi
Region sampled England England
Number of sites 6 5
Samples per site 6 6
Total samples removed 0 3
Total samples 36 27
Total samples per treatment 18 13 (for most sites)
Average reads per sample ~66 000 ~61 000

Sampling Summary



Analysing soil moisture between control and shelter

Treatment - overall

Treatment - site-by-site

Import, filter and check

QIIME2 generated feature table, taxonomy file, metadata and rooted phylogenetic tree were imported into R (insert citation) using qiime2r and objects for bacteria and fungi were generated using R phyloseq library (insert citation). General statistics were generated using the microbiome package (insert citation).

Bacteria

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 28079 taxa and 36 samples ]
## sample_data() Sample Data:       [ 36 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 28079 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 28079 tips and 28055 internal nodes ]
## [[1]]
## [1] "1] Min. number of reads = 13222"
## 
## [[2]]
## [1] "2] Max. number of reads = 32667"
## 
## [[3]]
## [1] "3] Total number of reads = 865814"
## 
## [[4]]
## [1] "4] Average number of reads = 24050.3888888889"
## 
## [[5]]
## [1] "5] Median number of reads = 23675.5"
## 
## [[6]]
## [1] "7] Sparsity = 0.9460915828753"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
## 
## [[8]]
## [1] "8] Number of singletons = 951"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)3.38687275187863"
## 
## [[10]]
## [1] "10] Number of sample variables are: 4"
## 
## [[11]]
## [1] "Site"      "Field_no"  "Treatment" "Group_by"

Fungi

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 6327 taxa and 25 samples ]
## sample_data() Sample Data:       [ 25 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 6327 taxa by 7 taxonomic ranks ]

Bacteria

The bacteria feature table was filtered to remove anything assigned as Archaea or unassigned.

## 
##  d__Archaea d__Bacteria  Unassigned 
##          19       28055           5
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 28055 taxa and 36 samples ]
## sample_data() Sample Data:       [ 36 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 28055 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 28055 tips and 28031 internal nodes ]

Fungi

The fungal feature table was also checked to ensure there were only fungi present.

## 
## Fungi 
##  6327

Agglomeration

Both the bacterial and fungal feature tables were then agglomerated to genera level. Statistics such as read counts and sparsity metrics (proportion of cells where count = 0) were generated using the microbiome package. This was to see how the number of reads and taxa change with each step.

Bacteria

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 1118 taxa and 36 samples ]
## sample_data() Sample Data:       [ 36 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 1118 taxa by 7 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 1118 tips and 1117 internal nodes ]
## [[1]]
## [1] "1] Min. number of reads = 13220"
## 
## [[2]]
## [1] "2] Max. number of reads = 32660"
## 
## [[3]]
## [1] "3] Total number of reads = 865546"
## 
## [[4]]
## [1] "4] Average number of reads = 24042.9444444444"
## 
## [[5]]
## [1] "5] Median number of reads = 23657.5"
## 
## [[6]]
## [1] "7] Sparsity = 0.666790896442059"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
## 
## [[8]]
## [1] "8] Number of singletons = 3"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)0.268336314847943"
## 
## [[10]]
## [1] "10] Number of sample variables are: 4"
## 
## [[11]]
## [1] "Site"      "Field_no"  "Treatment" "Group_by"

Fungi

## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 711 taxa and 25 samples ]
## sample_data() Sample Data:       [ 25 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 711 taxa by 7 taxonomic ranks ]
## [[1]]
## [1] "1] Min. number of reads = 30659"
## 
## [[2]]
## [1] "2] Max. number of reads = 78553"
## 
## [[3]]
## [1] "3] Total number of reads = 1329016"
## 
## [[4]]
## [1] "4] Average number of reads = 53160.64"
## 
## [[5]]
## [1] "5] Median number of reads = 54650"
## 
## [[6]]
## [1] "7] Sparsity = 0.697383966244726"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? NO"
## 
## [[8]]
## [1] "8] Number of singletons = 0"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)0"
## 
## [[10]]
## [1] "10] Number of sample variables are: 4"
## 
## [[11]]
## [1] "Site"      "Field_no"  "Treatment" "Group_by"

Aggregation

Feature tables were then aggregated to site level using speedyseq (insert citation) as phyloseq merge_samples combines abundances using sum. Median was felt to be more sample representative. Read counts were again generated to check the aggregation step was performed correctly and metadata was checked to ensure no NAs were coerced into the data.

Bacteria

## phyloseq-class experiment-level object
## otu_table()   OTU Table:          [ 1118 taxa and 12 samples ]:
## sample_data() Sample Data:        [ 12 samples by 4 sample variables ]:
## tax_table()   Taxonomy Table:     [ 1118 taxa by 7 taxonomic ranks ]:
## phy_tree()    Phylogenetic Tree:  [ 1118 tips and 1117 internal nodes ]:
## taxa are rows
## [[1]]
## [1] "1] Min. number of reads = 16398"
## 
## [[2]]
## [1] "2] Max. number of reads = 28026"
## 
## [[3]]
## [1] "3] Total number of reads = 276014"
## 
## [[4]]
## [1] "4] Average number of reads = 23001.1666666667"
## 
## [[5]]
## [1] "5] Median number of reads = 22886.5"
## 
## [[6]]
## [1] "7] Sparsity = 0.692382230172928"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
## 
## [[8]]
## [1] "8] Number of singletons = 484"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)0.0894454382826476"
## 
## [[10]]
## [1] "10] Number of sample variables are: 4"
## 
## [[11]]
## [1] "Site"      "Field_no"  "Treatment" "Group_by"

Fungi

## phyloseq-class experiment-level object
## otu_table()   OTU Table:          [ 711 taxa and 10 samples ]:
## sample_data() Sample Data:        [ 10 samples by 4 sample variables ]:
## tax_table()   Taxonomy Table:     [ 711 taxa by 7 taxonomic ranks ]:
## taxa are rows
## [[1]]
## [1] "1] Min. number of reads = 36120"
## 
## [[2]]
## [1] "2] Max. number of reads = 56936"
## 
## [[3]]
## [1] "3] Total number of reads = 481530"
## 
## [[4]]
## [1] "4] Average number of reads = 48153"
## 
## [[5]]
## [1] "5] Median number of reads = 48798.5"
## 
## [[6]]
## [1] "7] Sparsity = 0.705344585091421"
## 
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
## 
## [[8]]
## [1] "8] Number of singletons = 203"
## 
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n        (i.e. exactly one read detected across all samples)0.281293952180028"
## 
## [[10]]
## [1] "10] Number of sample variables are: 4"
## 
## [[11]]
## [1] "Site"      "Field_no"  "Treatment" "Group_by"

Rarefaction plot

Rarefaction curves were then generated to find sampling depths using the rarecurve function in the vegan package to generate a tibble friendly data frame to be passed to ggplot2 to generate attractive rarefaction plots.

Bacteria - aggregated data

Bacteria - non-aggregated data

Fungi - aggregated data

Fungi - non-aggregated data

Apply rarefaction

Rarefaction was applied to the bacterial and fungal feature tables. The bacteria feature table was rarefied to a sampling depth of 16 398 while fungi was rarefied to a sampling depth of 36 120.

Bacteria

## phyloseq-class experiment-level object
## otu_table()   OTU Table:          [ 627 taxa and 12 samples ]:
## sample_data() Sample Data:        [ 12 samples by 4 sample variables ]:
## tax_table()   Taxonomy Table:     [ 627 taxa by 7 taxonomic ranks ]:
## phy_tree()    Phylogenetic Tree:  [ 627 tips and 626 internal nodes ]:
## taxa are rows
## phyloseq-class experiment-level object
## otu_table()   OTU Table:          [ 1084 taxa and 36 samples ]:
## sample_data() Sample Data:        [ 36 samples by 4 sample variables ]:
## tax_table()   Taxonomy Table:     [ 1084 taxa by 7 taxonomic ranks ]:
## phy_tree()    Phylogenetic Tree:  [ 1084 tips and 1083 internal nodes ]:
## taxa are rows

Fungi

## phyloseq-class experiment-level object
## otu_table()   OTU Table:          [ 509 taxa and 10 samples ]:
## sample_data() Sample Data:        [ 10 samples by 4 sample variables ]:
## tax_table()   Taxonomy Table:     [ 509 taxa by 7 taxonomic ranks ]:
## taxa are rows
## phyloseq-class experiment-level object
## otu_table()   OTU Table:          [ 708 taxa and 25 samples ]:
## sample_data() Sample Data:        [ 25 samples by 4 sample variables ]:
## tax_table()   Taxonomy Table:     [ 708 taxa by 7 taxonomic ranks ]:
## taxa are rows

Taxonomy

Bar plots showing taxonomic composition at a phylum, class and family were generated for each variable being assessed: treatment, site and group (site-treatment). Functions for each type of bar plot were created.

Function for bar plot at family level (cutoff for grouping was set to 2% as a colour palette large enough to show all families present was not available at the time:

Bacteria

Phylum

Treatment
Site
Site-treatment

Class

Treatment
Site
Site-treatment

Family

Treatment
Site
Site-treatment

Fungi

Phylum

Treatment
Site
Site-treatment

Class

Treatment
Site
Site-treatment

Family

Treatment
Site
Site-treatment

Alpha diversity

Alpha diversity calculations were run for both bacteria and fungi to determine if there were differences in diversity between sites, treatments and groups.

Alpha diversity plots were generated

Bacteria

##                                Chao1  se.chao1  Shannon   Simpson
## HARLEY_FARMS_new2b.Control  362.3333 3.4039503 4.802679 0.9839348
## HARLEY_FARMS_new2b.Drought  340.6400 2.8689114 4.637164 0.9803463
## WINDMILL_farm.Control       312.0526 0.2553923 4.748467 0.9840634
## WINDMILL_farm.Drought       402.3030 6.1951083 4.835666 0.9849811
## Castle_Field_West_W.Control 348.7500 2.4097307 4.895923 0.9871872
## Castle_Field_West_W.Drought 352.3333 1.5128057 4.866165 0.9865795
## Jemma_6.Control             368.2759 3.5716413 4.811119 0.9848627
## Jemma_6.Drought             349.3235 1.4677009 4.766784 0.9841443
## Jemma_9.Control             333.9545 4.5211664 4.634408 0.9828976
## Jemma_9.Drought             320.0385 3.0506240 4.589781 0.9819630
## Lardon.Chase.Control        321.6400 2.3337655 4.613960 0.9808540
## Lardon.Chase.Drought        320.5652 3.3753254 4.606299 0.9805645
##          Chao1    se.chao1  Shannon   Simpson
## P19C1 403.2195  4.88241506 4.862187 0.9848159
## P19C2 391.6774  5.31177472 4.876749 0.9853763
## P19C3 397.6875  8.70193023 4.816635 0.9844514
## P19S1 342.3636  3.95687115 4.619438 0.9798207
## P19S2 432.8077  5.84213144 4.854027 0.9845005
## P19S3 362.7895  8.35360799 4.656803 0.9803369
## P20C1 332.1364  0.41939654 4.867252 0.9861849
## P20C2 314.3333  3.31205690 4.732067 0.9839524
## P20C3 392.7632  6.83887130 4.791064 0.9842036
## P20S1 427.0000  6.37667826 4.899164 0.9856368
## P20S2 424.8750 10.08516817 4.842437 0.9846286
## P20S3 445.0196  9.85975813 4.824488 0.9836183
## P2C1  437.7660  6.86367011 5.016905 0.9883510
## P2C2  369.6667  4.41362368 4.900065 0.9870109
## P2C3  374.4545  5.89515694 4.871537 0.9867647
## P2S1  384.6579  3.98204181 4.911651 0.9866959
## P2S2  389.4839  9.33712710 4.877424 0.9871407
## P2S3  389.9767  4.02464189 4.913404 0.9870237
## P54C1 399.6923  5.09510775 4.920525 0.9864512
## P54C2 432.2857 15.30696899 4.830480 0.9852618
## P54C3 402.5714  6.00027513 4.787565 0.9833969
## P54S1 405.5227  6.25307921 4.818479 0.9842408
## P54S2 372.6154  7.71197696 4.777175 0.9841551
## P54S3 354.8857  3.29486351 4.747247 0.9837554
## P55C1 373.5312  7.00594123 4.677237 0.9826638
## P55C2 240.7143  2.99081547 4.275542 0.9759490
## P55C3 400.0000  7.68283400 4.824805 0.9856630
## P55S1 307.6364  0.95905767 4.584209 0.9821484
## P55S2 366.0857  7.06073958 4.678670 0.9830788
## P55S3 338.3333  4.66761917 4.641621 0.9826760
## P67C1 281.0000  0.01721067 4.600609 0.9815854
## P67C2 373.6383  4.56671019 4.653293 0.9805289
## P67C3 380.3077  3.27725056 4.806963 0.9844517
## P67S1 358.3333  4.66762509 4.679350 0.9825411
## P67S2 354.0000  7.21313132 4.572444 0.9789350
## P67S3 330.0370  6.35557578 4.574235 0.9798843

Treatment

Sites - control

Sites - shelter

Fungi

Treatment

Sites - control

Sites - shelter

Post-hoc pairwise testing was carried out for the site and group variables using Dunn test as it is the recommended test to be run after Kruskal-Wallis:

Post-hoc pairwise for alpha diversity

Bacteria

Sites - rain shelter

Beta diversity and ordination plots

Distances between samples were also calculated using ordinate method. The distance metrics selected were weighted/unweighted unifrac and Bray-Curtis

Bacteria

3D plots

Unweighted Unifrac
Weighted Unifrac
Bray-Curtis

Fungi

3D plots

Bray-Curtis
Jaccard

Permanova analysis

Bacteria

Unweighted Unifrac

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = unifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = "terms")
##           Df SumOfSqs      R2      F Pr(>F)    
## Treatment  1  0.05273 0.11294 1.6657  0.013 *  
## Site       5  0.25586 0.54803 1.6165  0.001 ***
## Residual   5  0.15828 0.33902                  
## Total     11  0.46688 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = unifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = NULL)
##          Df SumOfSqs      R2      F Pr(>F)    
## Model     6  0.30860 0.66098 1.6247  0.001 ***
## Residual  5  0.15828 0.33902                  
## Total    11  0.46688 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Weighted Unifrac

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = wunifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = "terms")
##           Df SumOfSqs      R2      F Pr(>F)   
## Treatment  1 0.003162 0.03713 0.8583  0.534   
## Site       5 0.063599 0.74661 3.4523  0.002 **
## Residual   5 0.018422 0.21626                 
## Total     11 0.085184 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = wunifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = NULL)
##          Df SumOfSqs      R2    F Pr(>F)    
## Model     6 0.066762 0.78374 3.02  0.001 ***
## Residual  5 0.018422 0.21626                
## Total    11 0.085184 1.00000                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Bray-Curtis

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = bray_dist.bac ~ Treatment + Site, data = metadata.bac, by = "terms")
##           Df SumOfSqs      R2      F Pr(>F)    
## Treatment  1 0.009715 0.03179 0.7017  0.670    
## Site       5 0.226687 0.74173 3.2749  0.001 ***
## Residual   5 0.069219 0.22649                  
## Total     11 0.305621 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = bray_dist.bac ~ Treatment + Site, data = metadata.bac, by = NULL)
##          Df SumOfSqs      R2     F Pr(>F)    
## Model     6 0.236401 0.77351 2.846  0.001 ***
## Residual  5 0.069219 0.22649                 
## Total    11 0.305621 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Fungi

Bray-Curtis

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = bray_dist.fun ~ Treatment + Site, data = metadata.fun, by = "terms")
##           Df SumOfSqs      R2      F Pr(>F)    
## Treatment  1  0.04056 0.05217 1.0789  0.423    
## Site       4  0.58654 0.75441 3.9005  0.001 ***
## Residual   4  0.15038 0.19342                  
## Total      9  0.77748 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = bray_dist.fun ~ Treatment + Site, data = metadata.fun, by = NULL)
##          Df SumOfSqs      R2      F Pr(>F)   
## Model     5  0.62710 0.80658 3.3361  0.004 **
## Residual  4  0.15038 0.19342                 
## Total     9  0.77748 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Jaccard

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = jacc_dist.fun ~ Treatment + Site, data = metadata.fun, by = "terms")
##           Df SumOfSqs      R2      F Pr(>F)   
## Treatment  1  0.08834 0.06436 1.0251  0.431   
## Site       4  0.93962 0.68451 2.7256  0.002 **
## Residual   4  0.34473 0.25114                 
## Total      9  1.37270 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = jacc_dist.fun ~ Treatment + Site, data = metadata.fun, by = NULL)
##          Df SumOfSqs      R2      F Pr(>F)   
## Model     5  1.02796 0.74886 2.3855  0.002 **
## Residual  4  0.34473 0.25114                 
## Total     9  1.37270 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1